OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Estimating Digitization Costs in Digital Libraries Using DiCoMo

Internal identifier: 000721 (Main/Exploration); previous: 000720; next: 000722

Authors: Alejandro Bia [Spain]; Rafael Muñoz [Spain]; Jaime Gómez [Spain]

RBID : ISTEX:0310DDA70CB1F641B7B6AAF78452610506B1FE20

Abstract

Estimating digitization costs is a very difficult task. It is difficult to make exact predictions because of the large number of unknown factors. However, digitization projects need a precise idea of the economic costs and the time involved in developing their contents. The common practice when starting to digitize a new collection is to set a schedule and make a firm commitment to fulfill it (in terms of both cost and deadlines), even before the actual digitization work starts. As with software development projects, incorrect estimates produce delays and cause cost overruns. Based on methods used in Software Engineering for software development cost prediction, such as COCOMO and Function Points, and using historical data gathered over five years at the Miguel de Cervantes Digital Library during the digitization of more than 12,000 books, we have developed a method for time and cost estimation named DiCoMo (Digitization Costs Model) for digital content production in general. This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning and OCR followed by human proofreading and error correction, or the production of digital facsimiles (scanning without OCR). The accuracy of the estimates improves with time, since the algorithms can be optimized by making adjustments based on historical data gathered from previous tasks.

DOI: 10.1007/978-3-642-15464-5_15


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Estimating Digitization Costs in Digital Libraries Using DiCoMo</title>
<author>
<name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
</author>
<author>
<name sortKey="Mu Oz, Rafael" sort="Mu Oz, Rafael" uniqKey="Mu Oz R" first="Rafael" last="Mu Oz">Rafael Muñoz</name>
</author>
<author>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:0310DDA70CB1F641B7B6AAF78452610506B1FE20</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15464-5_15</idno>
<idno type="url">https://api.istex.fr/document/0310DDA70CB1F641B7B6AAF78452610506B1FE20/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000149</idno>
<idno type="wicri:Area/Istex/Curation">000147</idno>
<idno type="wicri:Area/Istex/Checkpoint">000301</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Bia A:estimating:digitization:costs</idno>
<idno type="wicri:Area/Main/Merge">000726</idno>
<idno type="wicri:Area/Main/Curation">000721</idno>
<idno type="wicri:Area/Main/Exploration">000721</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Estimating Digitization Costs in Digital Libraries Using DiCoMo</title>
<author>
<name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>CIO/DEMI, Miguel Hernández University</wicri:regionArea>
<wicri:noRegion>Miguel Hernández University</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author>
<name sortKey="Mu Oz, Rafael" sort="Mu Oz, Rafael" uniqKey="Mu Oz R" first="Rafael" last="Mu Oz">Rafael Muñoz</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>DLSI, University of Alicante</wicri:regionArea>
<wicri:noRegion>University of Alicante</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>DLSI, University of Alicante</wicri:regionArea>
<wicri:noRegion>University of Alicante</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">0310DDA70CB1F641B7B6AAF78452610506B1FE20</idno>
<idno type="DOI">10.1007/978-3-642-15464-5_15</idno>
<idno type="ChapterID">15</idno>
<idno type="ChapterID">Chap15</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Estimating digitization costs is a very difficult task. It is difficult to make exact predictions because of the large number of unknown factors. However, digitization projects need a precise idea of the economic costs and the time involved in developing their contents. The common practice when starting to digitize a new collection is to set a schedule and make a firm commitment to fulfill it (in terms of both cost and deadlines), even before the actual digitization work starts. As with software development projects, incorrect estimates produce delays and cause cost overruns. Based on methods used in Software Engineering for software development cost prediction, such as COCOMO and Function Points, and using historical data gathered over five years at the Miguel de Cervantes Digital Library during the digitization of more than 12,000 books, we have developed a method for time and cost estimation named DiCoMo (Digitization Costs Model) for digital content production in general. This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning and OCR followed by human proofreading and error correction, or the production of digital facsimiles (scanning without OCR). The accuracy of the estimates improves with time, since the algorithms can be optimized by making adjustments based on historical data gathered from previous tasks.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Espagne</li>
</country>
</list>
<tree>
<country name="Espagne">
<noRegion>
<name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
</noRegion>
<name sortKey="Bia, Alejandro" sort="Bia, Alejandro" uniqKey="Bia A" first="Alejandro" last="Bia">Alejandro Bia</name>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
<name sortKey="G Mez, Jaime" sort="G Mez, Jaime" uniqKey="G Mez J" first="Jaime" last="G Mez">Jaime Gómez</name>
<name sortKey="Mu Oz, Rafael" sort="Mu Oz, Rafael" uniqKey="Mu Oz R" first="Rafael" last="Mu Oz">Rafael Muñoz</name>
<name sortKey="Mu Oz, Rafael" sort="Mu Oz, Rafael" uniqKey="Mu Oz R" first="Rafael" last="Mu Oz">Rafael Muñoz</name>
</country>
</tree>
</affiliations>
</record>
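
Outside Dilib, a record like the one above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of the Wicri tooling. It works on a trimmed copy of the record (one author, a few fields). The `summarize` helper and the `xmlns:wicri` declaration with its placeholder URI are assumptions added for illustration: the record as displayed does not declare the `wicri:` prefix, and a strict parser would reject it without such a declaration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above. The xmlns:wicri declaration and its
# URI are placeholders (an assumption), added so the wicri: prefix is bound.
RECORD = """<record xmlns:wicri="http://example.org/wicri">
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Estimating Digitization Costs in Digital Libraries Using DiCoMo</title>
<author>
<name first="Alejandro" last="Bia">Alejandro Bia</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:0310DDA70CB1F641B7B6AAF78452610506B1FE20</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-15464-5_15</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def summarize(record_xml):
    """Hypothetical helper: pull title, authors, DOI and year from a record."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find(".//titleStmt")
    pub_stmt = root.find(".//publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        # One <name> element per <author>; keep the display text only.
        "authors": [name.text for name in title_stmt.findall("author/name")],
        # Select the <idno> whose type attribute is exactly "doi".
        "doi": pub_stmt.findtext("idno[@type='doi']"),
        "year": pub_stmt.find("date").get("year"),
    }


if __name__ == "__main__":
    for field, value in summarize(RECORD).items():
        print(f"{field}: {value}")
```

The attribute predicate `idno[@type='doi']` is what distinguishes the DOI from the other `idno` entries in `publicationStmt`, which all share the same element name.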

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000721 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000721 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:0310DDA70CB1F641B7B6AAF78452610506B1FE20
   |texte=   Estimating Digitization Costs in Digital Libraries Using DiCoMo
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024